Abstract. The National Aeronautics and Space Administration Global
Modeling and Assimilation Office (NASA/GMAO) observing system sim- 
ulation experiment (OSSE) framework is used to explore the response of anal- 
ysis error and forecast skill to observation quality. In an OSSE, synthetic ob- 
servations may be created that have much smaller error than real observa- 
tions, and precisely quantified error may be applied to these synthetic ob- 
servations. Three experiments are performed in which synthetic observations 
with magnitudes of applied observation error that vary from zero to twice 
the estimated realistic error are ingested into the Goddard Earth Observ- 
ing System Model (GEOS-5) with Gridpoint Statistical Interpolation (GSI) 
data assimilation for a one-month period representing July. The analysis in- 
crement and observation innovation are strongly impacted by observation 
error, with much larger variances for increased observation error. The anal- 
ysis quality is degraded by increased observation error, but the change in root- 
mean-square error of the analysis state is small relative to the total analy- 
sis error. Surprisingly, in the 120 hour forecast, increased observation error 
only yields a slight decline in forecast skill in the extratropics and no dis- 
cernible degradation of forecast skill in the tropics. 


©2013 American Geophysical Union. All Rights Reserved. 


Accepted Article 


1. Introduction 


There are multiple sources of error in numerical weather analysis and prediction includ- 
ing model error, observation instrument and representativeness error, errors introduced 
by the data assimilation process itself, and physical-dynamical error growth. Because the 
true state of the atmosphere remains unknown, it is not possible to directly assess these 
errors or their impact on analysis quality or forecast skill. Many efforts have been made 
to investigate the impact of initial condition errors on forecast skill, such as with idealized 
identical or fraternal twin experiments (e.g. Tribbia and Baumhefner [2004]), but these 
studies have not considered errors in the context of data assimilation systems. 

Previous studies (e.g. Tyndall et al. [2010], Irvine et al. [2011]) have examined the role 
of observation error in data assimilation, primarily in the form of the weighting of obser- 
vational data versus the background. Changing the specified observation error variance 
or background error variance in a data assimilation system (DAS) alters how closely the 
analysis field draws to the observations compared to the background. This study instead is
focused primarily on how the observation errors themselves impact qualities of the model 
analysis and forecast fields. 

There are many unanswered quantitative and qualitative questions about how obser- 
vation error impacts the errors of analysis and subsequent forecasts given that the DAS 
is designed as an error filter and smoother (Daley [1991]). Modern DAS are based on
elegant mathematical theory, as outlined in the Appendix, that unfortunately offers only
limited insight into answers to these questions because of the many unsupported assumptions
generally implied for their computationally efficient application. Answers are also
not forthcoming when using real observations since in that context the true state being
analyzed is not sufficiently well known. In contrast, an Observing System Simulation 
Experiment (OSSE) alleviates many of these difficulties since relevant errors can be directly
calculated from the accurately known truth provided (Errico et al. [2013]). As long
as the OSSE is a faithful simulation of reality, it can provide valuable insight into these 
questions. 

An OSSE suitable for this problem has been developed at the National Aeronautics and 
Space Administration (NASA) Global Modeling and Assimilation Office (GMAO; Errico 
et al. [2013], Prive et al. [2013]). It provides a tool for investigating how errors in sources 
of information or algorithms impact the analysis, background, and forecast errors. In 
addition, the observation errors in an OSSE can be directly manipulated to explore the 
impact of observation error on the analysis quality and forecast skill. In this work, a series 
of experiments with varied observation error are performed using the GMAO OSSE to 
explore the influence of observation error in an operational numerical weather forecasting 
system. 

The motivating factors for this study include both the design of OSSEs and the effects 
of observation error when assimilating real observations. The development of realistic 
observation errors for synthetic observations in OSSEs has been a challenging problem for 
decades. Here, the importance of accurately representing observation errors is investigated 
by testing the response of the OSSE framework to a range of observation error magnitudes
from minimization of observation errors to gross overestimation of observation errors. 
A variety of metrics are employed, including explicit measures of analysis error. The 
importance of proper weighting of error covariance matrices is also explored. 
Details of the GMAO OSSE framework and the experimental setup are given in Section 2.
The influence of observation error on increment and error statistics of the data
assimilation products is described in Section 3. Likewise the effect of observation error on 
forecast skill is presented in Section 4 and on observation impact metrics calculated with 
an adjoint model in Section 5. Finally, the results will be discussed in Section 6. 

2. Setup 

The GMAO OSSE framework is used for all experiments. This system is described 
in detail by Errico et al. [2013]; a brief synopsis will be given here. An OSSE consists 
of several components: a long, free model integration called the Nature Run (NR) that 
represents the ‘truth’; a set of synthetic observations produced from the Nature Run 
fields for all data types currently assimilated to create initial conditions for numerical 
weather prediction; an observation error algorithm to add otherwise missing instrument 
and representativeness errors to observations; and a data assimilation system employing 
a second forecast model for ingesting the synthetic observations. 

The NR used for the GMAO OSSE was generated by the European Centre for Medium-Range
Weather Forecasts (ECMWF) using the c31r1 version of their operational forecasting
model. The model was freely run from 01 May 2005 to 31 May 2006 at T511 resolution
with 91 vertical levels and 3-hourly output. Prescribed boundary conditions included the 
sea surface temperature and sea ice content observed during the NR period; all other 
fields were generated by the ECMWF model. The NR has been evaluated to ensure that 
the model characteristics are suitable for use in OSSEs (Reale et al. [2007], McCarty et al.
[2012]).
Synthetic observations were created at the GMAO for both conventional and radiance
data types. Conventional data were computed by interpolating the NR fields according 
to the temporal and spatial locations of archived observations from corresponding dates 
during 2005-2006. Radiance observations were similarly generated using the Commu- 
nity Radiative Transfer Model version 1.2 (CRTM, Han et al. [2006]) with a simplified 
treatment of the clouds based on cloud fractions from the NR. 

A set of baseline observation errors was calibrated to match some assimilation statistics
of real data ingested into the same versions of GSI and GEOS-5. Uncorrelated errors
were added to all observation types and an additional component of correlated errors was 
added to some types. Vertically correlated errors were added to conventional sounding 
data types, horizontally correlated errors were added to AMSU, HIRS, and MSU observa- 
tions, channel correlated errors were added to AIRS, and both vertically and horizontally 
correlated errors were added to satellite wind observations. No correlation of errors was 
applied between different data types, and no observation error bias was added. The observation
errors were calibrated so that covariances of observation innovations and variances
of analysis increments in the OSSE matched corresponding statistics computed for the 
DAS applied to real observations (Errico et al. [2013]). As a result of this tuning, the
added errors may contain compensations due to mismatches between the OSSE and real 
observation results of actual background error covariances. 

In addition to explicitly added errors, the synthetic observations contain a small but 
unspecified quantity of implicit representativeness error. This error arises from differences 
between interpolations used to create the synthetic observations applied on the NR and 
DAS model grids. Errors are also introduced to the radiance observations through differences
between treatments of cloud in the radiative transfer schemes applied to the NR
and DAS gridded fields. 

The numerical weather prediction model used for the OSSE experiments is the Goddard 
Earth Observing System Model, Version 5 (GEOS-5) with Gridpoint Statistical Interpolation
(GSI) data assimilation system (Kleist et al. [2009], Rienecker et al. [2008]). The
model resolution is 0.5° latitude and 0.625° longitude with 72 vertical levels. The behavior 
of the OSSE forecasts has been validated in comparison to reality by Prive et al. [2013], 
where it was found that the forecast skill of the OSSE is slightly better than for real data, 
but the relative impact of different data types is well represented. 

For these experiments, the OSSE is cycled from 15 June 2005 to 05 August 2005, with
120 hour forecasts launched daily at 0000 UTC. The first two weeks are discarded as a 
spin-up period, and results are calculated only for the month of July. Three experimental 
cases are tested: a Control case using the baseline set of synthetic observations with 
calibrated observation errors described by Errico et al. [2013]; a Perfect case in which no 
errors are added to the synthetic observations; and a case in which observation errors with 
standard deviation twice that of the Control case are added to the synthetic
observations, called the Double case. The explicitly added errors in the Double case are 
perfectly correlated to the errors in the Control case, with twice the magnitude. Table 1 
displays the attributes of all of the experimental cases included in this study. These three 
cases can be compared to show the progression of the effects of observation errors as the 
errors are increased from near zero to large values. 

For Perfect, Control, and Double cases, the background and observation error covari- 
ances assumed by the GSI are not altered from the operational values. This preserves the 
GSI Kalman gain matrix and thus the weightings between observations and background.
For none of these three OSSE experiments is this Kalman gain truly optimal since the 
assumed error covariances are not the actual ones. Even for assimilation of real observa- 
tions, the specified background error covariance likely differs from the actual covariances 
for some components and the specified observation error ignores significant correlations 
known to exist for some observation types and instead grossly inflates the assumed error
variances to partly compensate for this neglect. For the Perfect and Double cases,
the departures from optimality may be greater, but even in these cases a gain closer to
optimal would require use of a retuned assumed background error covariance. Such retuning would
partly offset use of a more appropriate assumed observation error variance. For any of the
experiments, assumption of truly accurate error covariances would produce the optimal 
analysis; i.e., the analysis with minimum expected error variance given the observation and
background errors. Results from these experiments therefore provide an upper bound on 
what the corresponding optimal error variances would be. 
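
The upper-bound argument can be illustrated with a scalar analogue. This is a minimal sketch, not the GSI algorithm: it assumes a single background value and a single observation with uncorrelated errors, and the function names and numerical values are hypothetical.

```python
def gain(b_assumed, r_assumed):
    """Scalar Kalman gain implied by the assumed error variances."""
    return b_assumed / (b_assumed + r_assumed)


def analysis_error_variance(k, b_true, r_true):
    """Expected analysis error variance for a scalar gain k, assuming
    uncorrelated background and observation errors."""
    return (1.0 - k) ** 2 * b_true + k ** 2 * r_true


b_true = 1.0                 # true background error variance (arbitrary units)
r_control = 1.0              # observation error variance the gain was tuned for
r_double = 4.0 * r_control   # doubling the standard deviation quadruples the variance

k_fixed = gain(b_true, r_control)    # gain left unchanged while the errors double
k_optimal = gain(b_true, r_double)   # gain built from the true variances

var_fixed = analysis_error_variance(k_fixed, b_true, r_double)
var_optimal = analysis_error_variance(k_optimal, b_true, r_double)
print(var_fixed, var_optimal)
```

With these illustrative numbers the unchanged gain gives an analysis error variance of 1.25, which exceeds the background variance of 1.0, while the gain built from the true variances gives 0.8. Any fixed, suboptimal gain therefore yields an error variance at or above the optimal value.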

An additional experiment is performed using the added observation errors from the 
Double case, but with the standard deviations of observation errors used by the GSI 
increased by a factor of two, denoted as the ‘Double Adjusted GSI’ case. While this also
does not result in an identical match between the true observation error covariances and 
the GSI error covariances, some underestimation of observation error covariances by the 
GSI in the Double case should be relieved in this case. A case with greatly reduced GSI 
error using the synthetic observations with no explicitly added error is not performed due 
to concerns that the data assimilation algorithm would become ill-conditioned. 
For validation of certain analysis and forecast statistics, a parallel case is run using
archived real data from the same time period instead of the synthetic observations. This 
case is designated as Real and is run using the same GEOS-5 and GSI version and settings 
as deployed in the OSSE. The analog of the Real case in the OSSE environment is the 
Control case, as the explicitly added observation errors in the Control case have been 
calibrated to specifically match the observation innovations and analysis increments in 
the Real case. A ‘Real Plus Error’ case is performed analogously to the Double case, 
wherein errors of the real observations are increased by explicitly adding errors with the 
same covariances used in the Control case to the real data. In this case, the observation 
error covariances are not expected to be identical to those used in the Double case, but 
the impacts of significantly increasing the observation error may be checked to ensure that 
the OSSE results are not unrealistic. 

The background error covariances used by the GSI are taken to be the operational 
2011 GSI/GEOS-5 covariances for all experiments. Due to improvements in the observing 
network between 2005 and 2011, these background error covariances may underestimate 
the true background errors when working with the 2005 observational dataset. In addi- 
tion, the true background error covariances may differ between experimental cases due to 
ingestion of different qualities of observation errors. 

3. Analysis Quality 

The observation innovation, $d_i$, measures the differences between observations and the
background state,

$$d_i = y^o_i - H_i\left[x^f(t_i)\right] \qquad (1)$$

where $t_i$ is the time, $y^o$ is the observation vector, $x^f$ is the forecast model state vector,
and $H$ is an observation operator in standard notation [Ide et al., 1997]. Observation
innovation statistics are expected to be strongly affected by the magnitude of observation
errors, as $y^o$ is directly affected by observation error and $x^f(t_i)$ is indirectly affected by
observation error that has been ingested in earlier cycles of the DAS.

The analysis increment, or analysis minus background $(x^a(t_i) - x^f(t_i))$, measures the
amount of ‘work’ done by the data assimilation system in generating an analysis state from 
the initial background state. The root-mean-square-error (RMSE) of such a difference is 
calculated as an areal and temporal mean 


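$$\mathrm{RMSE}_I = \sqrt{\frac{\sum_{j=1}^{N} \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} \left(x^a(t_j) - x^f(t_j)\right)^2 R_e^2 \cos\phi \, d\phi \, d\lambda}{N \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} R_e^2 \cos\phi \, d\phi \, d\lambda}} \qquad (2)$$

where $x^a$ is the analysis field and $x^f$ is the background field for $N$ analysis states, $R_e$ is
the radius of the earth, $\phi$ is the latitude between $\phi_s$ and $\phi_n$, and $\lambda$ is the longitude between
$\lambda_w$ and $\lambda_e$.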
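
The areal and temporal mean in (2) can be sketched for gridded fields as follows. This is an illustrative discretization, not the code used in the study: it assumes a regular latitude-longitude grid, for which the area element is proportional to the cosine of latitude, and the function name and sample values are hypothetical.

```python
import math


def area_weighted_rmse(diff_fields, lats_deg):
    """Root-mean-square of difference fields over time and area.

    diff_fields: list of N fields, each a list of rows (one per latitude)
                 holding one value per longitude, e.g. analysis minus background.
    lats_deg:    latitude of each row, in degrees.
    The constant R_e**2 cancels between numerator and denominator, so only
    the cos(latitude) weights are needed on a regular grid.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    num = 0.0
    den = 0.0
    for field in diff_fields:
        for w, row in zip(weights, field):
            for value in row:
                num += w * value ** 2
                den += w
    return math.sqrt(num / den)


# Two times, three latitudes, two longitudes (hypothetical increments).
lats = [-60.0, 0.0, 60.0]
increments = [
    [[1.0, 1.0], [2.0, 2.0], [1.0, 1.0]],
    [[1.0, 1.0], [2.0, 2.0], [1.0, 1.0]],
]
rmse = area_weighted_rmse(increments, lats)
print(rmse)
```

Because the equatorial row carries twice the weight of the rows at 60 degrees, the result here is the square root of 2.5 rather than the unweighted value, the square root of 2.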


Figure 1 shows a sampling of global variances of observation innovation for the Perfect, 
Control, and Double experimental cases for rawinsonde (RAOB) temperature and wind, 
GOES infrared (IR) cloud drift winds, and AMSU-A brightness temperatures. The vari- 
ance of observation innovations for the Control case is intermediate to that seen for the 
Perfect and Double cases. 

If the true error covariances of the background, B, were the same for the three test cases, 
and if the explicitly added observation errors are uncorrelated with the background errors, 
then the difference in variances of observation innovation between each pair of cases is 
simply the difference in the variances of the observation errors themselves. As the standard 
deviation of the observation error in the Double case is twice the standard deviation of 
the observation error in the Control case, it would be expected that the difference in
variance of observation innovation between the Double and Perfect cases would be four 
times as large as the difference between the Perfect and Control cases. This expected 
relation between observation innovation variances in the three experimental cases is seen 
for RAOB temperatures and winds and for AMSU-A in Figure 1, implying that changes 
to the background error covariances are relatively small. 
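
The factor of four can be checked with a small Monte Carlo sketch. The assumptions here are illustrative and not taken from the study: scalar Gaussian errors, an identity observation operator, the same background error sample in all three cases, and arbitrary standard deviations.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


random.seed(0)
n = 100_000
sigma_b = 1.0   # background error standard deviation (arbitrary)
sigma_o = 0.8   # Control observation error standard deviation (arbitrary)

background_err = [random.gauss(0.0, sigma_b) for _ in range(n)]
obs_err = [random.gauss(0.0, sigma_o) for _ in range(n)]


def innovations(scale):
    # Innovation = (truth + scaled obs error) - (truth + background error);
    # the truth cancels when the observation operator is the identity.
    return [scale * eo - eb for eo, eb in zip(obs_err, background_err)]


var_perfect = variance(innovations(0.0))   # no added observation error
var_control = variance(innovations(1.0))   # baseline observation error
var_double = variance(innovations(2.0))    # same errors with twice the magnitude

ratio = (var_double - var_perfect) / (var_control - var_perfect)
print(round(ratio, 2))
```

The ratio stays close to four only while the background error sample is shared across cases; a departure from four, as reported for the GOES winds, points to changed background errors or to a changed set of accepted observations.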

Results for GOES IR cloud drift winds show too large a difference between Perfect and 
Double observation innovation variances compared to Control and Perfect in the lower 
troposphere, and too small a difference in the middle and upper troposphere compared 
to the expected ratio of differences. In the upper troposphere, the ingested observation 
counts for the GOES cloud-drift winds are 20-30% smaller in the Double case than in the 
Perfect case, indicating that the quality control of the GSI has acted to remove some of 
the observations with very large observation errors. Thus, the observation error variance 
of the accepted observations is smaller than the variance of the observation errors applied 
to the entire dataset for the Double case, reducing the difference between the Perfect 
and Double cases. In the lower troposphere, the larger than expected difference between
the observation innovation variances for the Perfect and Double cases indicates that the
background error may have increased significantly from the Perfect to the Double case
in this region. Examination of the background error fields (not shown)
does indicate a significant increase in background error in the zonal wind field at low 
levels. 

When the observation error is increased, the spatial distribution of the analysis incre- 
ment variance is retained as the magnitude of the variance increases. This is illustrated 
in Figures 2 and 3 for the square roots of the zonal means of the temporal variances of
analysis increments of temperature and zonal wind respectively. The analysis increment 
variance of the Control case has been calibrated to emulate the Real analysis increments; 
the Double case has greater variance than Real and the Perfect case significantly lower 
variance than Real. The change in the variance of analysis increment between Perfect 
and Double is on the order of 30-50% increase in the upper troposphere and 25-100% in- 
crease in the lower troposphere. The relative impact of observation errors on the analysis 
increment is considerably smaller than the impact seen on the observation innovation as 
expected since the data assimilation algorithm acts as a filter and smoother of observation 
errors [Daley, 1991]. 

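The change in the error of the model state due to assimilation of observations is measured
by taking the difference of the absolute value of the analysis error and the absolute
value of the background error,

$$|A_e| - |B_e| = \frac{\sum_{j=1}^{N} \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} \left( \left|x^a(t_j) - x^t(t_j)\right| - \left|x^f(t_j) - x^t(t_j)\right| \right) R_e^2 \cos\phi \, d\phi \, d\lambda}{N \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} R_e^2 \cos\phi \, d\phi \, d\lambda} \qquad (3)$$

as in (2) where $x^t$ is the true Nature Run state. This metric is selected because it indicates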
whether the change introduced by the data assimilation process works to improve the 
analysis, to degrade the analysis, or if the net impact is neutral. Negative values indicate 
an improvement of the state due to assimilation of observations, while positive values 
indicate a degradation of the state. 
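
The sign convention of this metric can be sketched on a small grid. This is a hypothetical discretization that assumes a regular latitude-longitude grid with cosine-latitude area weights; the function name and sample fields are illustrative only.

```python
import math


def mean_abs_error_change(analyses, backgrounds, truths, lats_deg):
    """Area- and time-mean of |analysis error| minus |background error|.

    Each argument is a list over time of fields indexed [lat][lon].
    Negative values mean the analysis is closer to the truth than the
    background; positive values mean the assimilation degraded the state.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    num = 0.0
    den = 0.0
    for xa, xf, xt in zip(analyses, backgrounds, truths):
        for w, row_a, row_f, row_t in zip(weights, xa, xf, xt):
            for a, f, t in zip(row_a, row_f, row_t):
                num += w * (abs(a - t) - abs(f - t))
                den += w
    return num / den


lats = [-30.0, 30.0]
truth = [[[0.0, 0.0], [0.0, 0.0]]]
background = [[[1.0, 1.0], [1.0, 1.0]]]
good_analysis = [[[0.5, 0.5], [0.5, 0.5]]]   # halves the background error
bad_analysis = [[[1.5, 1.5], [1.5, 1.5]]]    # increases the background error

improved = mean_abs_error_change(good_analysis, background, truth, lats)
degraded = mean_abs_error_change(bad_analysis, background, truth, lats)
print(improved, degraded)
```

The first call returns a negative value and the second a positive one, matching the interpretation given above.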

The monthly mean of $|A_e| - |B_e|$ for July is shown in Figure 4. For the temperature
field, the assimilation improves upon the background state throughout the troposphere, 
and the observation errors do not strongly affect the magnitude of improvement. However, 
the wind fields show a much stronger response to the observation error, with significantly 
different results for the Perfect, Control, and Double cases. While the greatest improve- 
ment in the model state is seen for the Perfect case, the Control case also shows overall 
improvement due to observation assimilation. For the Double case however, the observa- 
tions in the middle and lower troposphere tend to cause a degradation of the background 
wind field, resulting in a lower quality analysis than if the observations had not been
assimilated; this is most notable in the Northern Hemisphere and the tropics. This degra- 
dation of the background state ideally should not occur if the background and observation 
error covariances used by the DAS were correct; in the Double case it is known that the 
actual observation error variances are greater than the variances used by the GSI for some 
data types. 


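The RMSE of the analysis is calculated for July

$$\mathrm{RMSE}_a = \sqrt{\frac{\sum_{i=1}^{N} \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} \left(x^a(t_i) - x^t(t_i)\right)^2 R_e^2 \cos\phi \, d\phi \, d\lambda}{N \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} R_e^2 \cos\phi \, d\phi \, d\lambda}} \qquad (4)$$
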
as in (3), plotted for temperature and zonal wind in Figure 5. Only a minor difference 
(2-3%) is seen in this analysis error statistic between the Perfect and Control cases for 
temperature, but a slightly larger increase in temperature error (5-10%) for the Double 
case is noted, with similar levels of change in the tropics and extratropics. The analysis 
error for zonal wind shows a larger spread between experiments, with a 5-10% increase 
in error in the Control compared to the Perfect case, and a 10-30% increase in analysis 
error between the Control and Double cases. The greatest percent change in error of the 
analysis wind field is found in the Northern Hemisphere extratropics, and the least change
in the tropical mid and upper troposphere. The large change in the Northern Hemisphere
extratropical wind field error is consistent with the finding that the data assimilation
process acts to degrade the winds in this region for the Double case (Figure 4). 
As previously described, the Double Adjusted GSI case is performed with the same
observation errors used in the Double case, but with the standard deviations of observation 
errors used by the GSI multiplied by two. The results from this case do not show a marked 
improvement in analysis skill compared to the Double case; instead there is a small increase
in analysis error for wind and temperature in the Southern Hemisphere extratropics (thin 
solid line in Figure 5). Comparing the dashed and thin solid lines in Figure 4 shows the 
improvement of the analysis state compared to the background state is nearly the same 
in the Double and Double Adjusted GSI cases. 

A discussion of the impacts of mismatched true observation error and DAS-assumed 
observation errors is given in the Appendix. One cause of the increased analysis error in 
the Double Adjusted GSI case is persistent model error due to differences in the preferred 
climatology of the ECMWF Nature Run and the GEOS-5 models. Because the assimi- 
lation does not draw as strongly to the observations in the Double Adjusted GSI case, 
in regions where there is a large difference in the model climatologies, the analysis state 
retains more of this GEOS-5 model ‘bias’ than the Double case. The error covariances 
for both background and observation errors are not ideal for either the Double or Double 
Adjusted GSI cases. In the Double Adjusted GSI case in particular, the background error 
covariances may be underestimated, resulting in an analysis that is drawn too strongly to 
an erroneous background. 

The spatially averaged monthly mean correlations r of the analysis error fields between 
the Control and Perfect, Control and Double, and Perfect and Double case pairs are 
calculated as 

$$A_e(\phi, \lambda) = x^a(\phi, \lambda) - x^t(\phi, \lambda) \qquad (5)$$

$$\sum_t \left(A_{e1}(\phi, \lambda) - \overline{A_{e1}}(\phi, \lambda)\right)\left(A_{e2}(\phi, \lambda) - \overline{A_{e2}}(\phi, \lambda)\right)$$


( 6 ) 


( 7 ) 


with notation as in (4), with the overbar indicating a time mean. The correlations of the
analysis error fields shown in Figure 6 are fairly high overall, particularly near the surface
for temperature. This implies that model error growth contributes significantly to the 
total analysis error field while the observation errors and their growth do not dominate 
the total error. If the observation errors introduced in the current cycle were a large source 
of analysis error, the correlation between the Control and Double cases would be expected
to be larger than the correlations between the Perfect case and either of the Control or
Double cases. This is because the added observation errors in the Control and Double
cases are identical except for a proportionality factor. The magnitudes of the correlations
of the analyses for the Control versus Perfect and Control versus Double cases are very
similar, implying that the dominant differences in the analysis error fields are due to the 
growth of observation and model errors from previous cycles, and that the immediate 
contribution of observation error from the current cycle is modest. This is consistent with 
the data assimilation design property that acts to filter spatially uncorrelated observation 
errors, which are the dominant type of observation error. 

4. Forecast Skill 

Forecast skill in the midlatitudes is often measured by the anomaly correlation of 500 
hPa geopotential. Anomaly correlation coefficients are calculated for the 120 hour fore- 
casts starting at 0000 UTC from 02 July to 30 July 2005 for each experimental case. The 
resulting monthly means and standard deviations of anomaly correlations are listed in 
Table 2. A Wilcoxon paired test p-value, the probability of obtaining a difference at least
as large as that observed if the null hypothesis were true, is calculated to determine if the
mean anomaly correlation of an experiment
is different from the Control case mean; values of p < 0.05 indicate significance at the 
95% level. With once-daily forecasts on sequential days, the anomaly correlation scores 
may be serially correlated in time. The autocorrelation r in Table 2 gives an indication of 
the degree of serial correlation. For most comparisons that show statistically significant 
results at the 95% level, the autocorrelation is small or even negative, indicating that the 
results of the Wilcoxon paired test are valid [Yue and Wang, 2002].
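
The serial-correlation check can be sketched as follows. The text does not state how the autocorrelation r in Table 2 is computed, so the lag-1 estimator below is an assumption, and the function name and sample series are hypothetical.

```python
def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a series of daily scores.

    Uses the full-series mean and variance, a common estimator; other
    normalizations exist and give slightly different values.
    """
    n = len(series)
    mean = sum(series) / n
    den = sum((x - mean) ** 2 for x in series)
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return num / den


# A steadily trending series is positively correlated from day to day ...
r_trend = lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0])
# ... while an alternating series is negatively correlated.
r_alternating = lag1_autocorrelation([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
print(r_trend, r_alternating)
```

Positive serial correlation reduces the effective number of independent forecasts, so a paired test that treats each day as independent tends to overstate significance; values near zero or below do not raise that concern.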

The five-day anomaly correlations show an overall insensitivity of forecast skill to ob- 
servation error. When the Perfect case is compared to the Control case, there is a slight 
improvement in the Southern Hemisphere anomaly correlation that is statistically signifi- 
cant, but no improvement is seen for the Northern Hemisphere skill. When the observation 
error is increased further in the Double case, a reduction in anomaly correlation is seen 
in both hemispheres, but the reduction is only significant at the 95% level in the North- 
ern Hemisphere. The reduction in anomaly correlation compared to the Control for the 
Double case is larger than the difference in anomaly correlation between the Perfect and 
Control cases (range of 0.02-0.03 in comparison to 0-0.01). 

The 120 hour forecast anomaly correlations for the Real and Real Plus Error cases are 
also given in Table 2. A slight decrease in forecast skill is seen in the Northern Hemi- 
sphere for the Real Plus Error compared to Real case, but this decrease is not statistically 
significant. A larger decrease in forecast skill is seen in the Southern Hemisphere, statis- 
tically significant at the 95% level, although the serial correlation is relatively high, which 
may result in overinflated significance estimates. The influence of observation errors on 
forecast skill for the real data is similar to that seen in the OSSE; i.e., a relatively small
degradation of anomaly correlation scores between 0.01 and 0.03. 

The root-mean-square forecast error at 120 hours verified against the Nature Run is 
calculated for the month of July as with the analysis error: 


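$$\mathrm{RMSE}_f = \sqrt{\frac{\sum_{i=1}^{N} \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} \left(x^f(t_i) - x^t(t_i)\right)^2 R_e^2 \cos\phi \, d\phi \, d\lambda}{N \int_{\lambda_w}^{\lambda_e} \int_{\phi_s}^{\phi_n} R_e^2 \cos\phi \, d\phi \, d\lambda}} \qquad (8)$$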

where there are N forecasts, and other variables are as in (4). Forecast error is plotted as 
a function of vertical level for temperature and zonal winds in Figure 7. In the tropics, 
there is no discernible difference in the forecast skill between the Perfect, Control, and
Double cases. The Northern Hemisphere shows no difference in skill between the Perfect
and Control cases, but an increase in error of 5% for the Double case. Only in the 
Southern Hemisphere is there a clear, but small, progression of forecast skill degradation
as the observation error increases: the forecast error increases 3-4% from the Perfect case
to the Control case and then an additional 4-8% from the Control to the Double case.

The spatial correlation of the 120 hour forecast error fields is calculated as in (5) but 
using $x^f$ instead of $x^a$ as a function of model level for three pairings: Perfect and Control,
Control and Double, and Perfect and Double; the results are plotted in Figure 8. The 
correlations between the pairing Perfect and Control and the pairing Control and Double 
are generally in the range of 0.7 to 0.75 throughout the troposphere, while correlations 
are lower, near 0.6, for the pairing Perfect and Double. To put this in perspective, a wave 
that is forecast to be 53° out of phase will have a correlation of 0.6. 
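
The phase-shift comparison follows from the fact that two sinusoids of equal wavelength offset by a phase angle have a correlation equal to the cosine of that angle. A minimal numerical check, with a hypothetical function name and an arbitrary sampling density:

```python
import math


def phase_shift_correlation(shift_deg, samples=3600):
    """Correlation between sin(x) and sin(x + shift) over one full period."""
    shift = math.radians(shift_deg)
    xs = [2.0 * math.pi * k / samples for k in range(samples)]
    a = [math.sin(x) for x in xs]
    b = [math.sin(x + shift) for x in xs]
    mean_a = sum(a) / samples
    mean_b = sum(b) / samples
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


r = phase_shift_correlation(53.0)
print(round(r, 2))
```

The result agrees with cos(53°), approximately 0.60.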

When the forecast error correlations are compared with the analysis error correlations 
(Figure 6), several differences are noted. First, in the midlatitudes, the correlations in the 
lower troposphere are smaller for the forecast error compared to the analysis error. At 
the analysis time, the near-surface error is likely to be dominated by representativeness
error and mismatches in model orography and boundary layer treatment between the
GEOS-5 and Nature Run, resulting in very high correlations between the three cases. 
During forward model integration, some errors increase nonlinearly, resulting in smaller 
correlations at the 5-day forecast time. 

In the middle and upper troposphere, the 120 hour forecast errors have slightly higher 
correlations between cases than the analysis error fields. At these levels, representativeness 
errors play a smaller role at analysis time and random observational error a larger role. 
During model integration, some errors are damped or destroyed by model processes, while 
other errors project onto unstable modes of the atmospheric state and grow with time. It 
is anticipated that as the forecast length is extended beyond 120 hours, the forecast error 
correlations would eventually decline and asymptote to a small positive number. 

The vertically integrated dry energy norm (DEN, Errico [2000]) is calculated for each 
experimental case and plotted as a function of forecast time in Figure 9. 


$$\mathrm{DEN} = \frac{\int_{p_t}^{p_s} \int_{\phi_s}^{\phi_n} \int_{\lambda_w}^{\lambda_e} \left( u^2 + v^2 + \frac{c_p}{T_r} T^2 \right) R_e^2 \cos(\phi) \, d\lambda \, d\phi \, dp}{2 \int_{p_t}^{p_s} \int_{\phi_s}^{\phi_n} \int_{\lambda_w}^{\lambda_e} R_e^2 \cos(\phi) \, d\lambda \, d\phi \, dp} \qquad (9)$$

as in (4), where $u$, $v$, and $T$ are the perturbations of the wind and temperature fields from
the truth, $p_s$ is the surface pressure and $p_t$ is the pressure at the top of the chosen volume,
here taken to be the model level closest to 72 hPa, $c_p = 1005$ J kg$^{-1}$ K$^{-1}$ is the specific
heat of dry air, and $T_r = 286$ K is a reference temperature. The small contribution to
DEN from surface pressure perturbations included in the more usual definition of the dry
energy norm is neglected from (9).

The error growth in the tropics (Figure 9e) shows initial rapid growth of error that then
flattens out after 48 hours before increasing again after 96 hours, while the extratropical
error growth is initially slow and then accelerates with forecast time. Comparing the
Control and Perfect cases, the difference in DEN declines or remains steady as the forecast 
progresses, with the Control case actually having lower DEN than the Perfect case by 
96 hours in the Northern Hemisphere. The Control versus Double case shows greater 
difference in DEN, but this difference likewise decreases with time. It is expected that if 
the forecast period were lengthened, the DEN would eventually saturate and the difference 
in DEN between cases would approach zero [Leith, 1974]. 
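
For readers who wish to evaluate a dry energy norm of this kind on gridded error fields, a minimal discrete sketch follows. It assumes the integrand $u^2 + v^2 + (c_p/T_r) T^2$ with a prefactor of one half, a regular latitude-longitude grid with uniform spacing, and pressure layers of known thickness. The function name, the grid conventions, and the Earth radius value are assumptions made for illustration and are not taken from the GEOS-5 or OSSE software.

```python
import numpy as np

C_P = 1005.0   # specific heat of dry air, J kg^-1 K^-1 (value from the text)
T_R = 286.0    # reference temperature, K (value from the text)
R_E = 6.371e6  # Earth radius, m (assumed value; not given in the text)


def dry_energy_norm(u, v, t, lat_deg, dlon_deg, dp):
    """Discrete dry-energy-norm integral on a regular latitude-longitude grid.

    u, v, t  : error fields with shape (nlev, nlat, nlon)
    lat_deg  : cell-centre latitudes in degrees, shape (nlat,), uniform spacing
    dlon_deg : uniform longitude spacing in degrees
    dp       : pressure thickness of each layer in Pa, shape (nlev,)
    """
    lat = np.deg2rad(lat_deg)
    dlat = np.deg2rad(abs(lat_deg[1] - lat_deg[0]))
    dlon = np.deg2rad(dlon_deg)
    # Horizontal area element R_e^2 cos(phi) dlambda dphi for each latitude row.
    area = (R_E ** 2) * np.cos(lat) * dlat * dlon
    energy = u ** 2 + v ** 2 + (C_P / T_R) * t ** 2
    weights = dp[:, None, None] * area[None, :, None]
    return 0.5 * float(np.sum(energy * weights))


# Usage with a uniform 1 m/s zonal wind error on a 2-degree global grid.
lat_centres = np.arange(-89.0, 90.0, 2.0)
layer_dp = np.array([1.0e4, 2.0e4, 3.0e4])
shape = (layer_dp.size, lat_centres.size, 180)
den_uniform = dry_energy_norm(np.ones(shape), np.zeros(shape), np.zeros(shape),
                              lat_centres, 2.0, layer_dp)
```

Restricting `lat_deg` to a latitude band gives regional values of the kind compared between the tropics and extratropics.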

5. Observation Impact 

One set of metrics that is often of great interest when performing an OSSE is the data 
impacts of various observation types. For the GEOS-5 model, a dry adjoint is available 
that can be used to efficiently determine estimates of these impacts on the 24-hour forecast 
[Gelaro and Zhu, 2009] using DEN as the norm. Figure 10 compares the observation 
impacts for a variety of observation types in the Perfect, Control, and Double cases. 
A negative impact indicates a reduction in the 24 hour forecast error. The observation 
impact is calculated using the Nature Run fields to verify the 24-hour forecasts, and not the 
analysis fields that are often used for real observations. The differences between verifying 
the observation impact against the Nature Run instead of the analyses are generally minor, 
although, with verification against the Nature Run, rawinsonde temperature observations 
have a significantly larger impact. 

The overall observation impacts seen in Figure 10 show expected behavior, with a few 
exceptions. Radiance observations dominate the impact for the Southern Hemisphere 
extratropics, with conventional data playing a strong role in the Northern Hemisphere 
extratropics. AMSU-B and conventional moisture observations show minimal impact 
due to the dry metric used for the adjoint calculations as well as the omission of moist 
processes from the adjoint model itself. The anomalous finding of detrimental AMSU-A 
impacts in the tropics is due to a known deficiency in this version of the GEOS-5, where 
the geostrophic coupling implied by background error correlations is improperly specified 
near the equator. 

The observation impact is a noisy metric, and with only a one-month cycling period, the 
differences between individual observation impacts for the three cases are not statistically 
significant at the 95% level. The total impact of all data types is also calculated for each 
of the three cases and shown in Table 3. In the Northern Hemisphere extratropics and 
tropics, there is not a statistically significant difference between the three cases, but the 
Southern Hemisphere has a statistically significant greater total observation impact for 
the Double case compared to the Control and Perfect cases. 

Observation impacts can be increased by two causes. One is that an observation has 
less error or is better utilized so that the expected reduction of analysis error is greater. 
Another is that the background error is greater so that the observation is allowed to cor- 
rect more. Greater background error can result from an increase of observation error, 
especially when all observation errors are increased simultaneously. This last relationship 
may mitigate the reduction of beneficial impacts by worsening observations in this way 
because observations are thereby allowed to do more ‘work’. Since the background is 
affected by forecast model error in addition to observation errors, a portion of the back- 
ground error covariance will remain unchanged as observation errors are altered. Thus the 
mitigation effect as described should itself be reduced by the presence of model error. If 
the observation error characteristics of a single observation type were changed while keeping 
the error characteristics of all other observation types unchanged, the relative impact 
of different observation types might undergo significant changes. 
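
The two causes can be illustrated with a scalar analysis update. The sketch below uses the expected reduction of error variance in a single optimal update as a stand-in for observation impact. This is an assumption made for illustration: the impacts in Figure 10 are adjoint-based estimates on the 24-hour forecast error, not analysis variance reductions, and the function and variable names are hypothetical.

```python
def variance_reduction(b, r):
    """Expected reduction of error variance, b - a, for a scalar optimal update.

    b : background error variance
    r : observation error variance
    """
    gain = b / (b + r)                         # scalar Kalman gain with H = 1
    a = (1.0 - gain) ** 2 * b + gain ** 2 * r  # analysis error variance
    return b - a


baseline = variance_reduction(b=1.0, r=1.0)           # 0.5
worse_obs = variance_reduction(b=1.0, r=4.0)          # 0.2
worse_obs_and_bkg = variance_reduction(b=2.0, r=4.0)  # larger again
```

Degrading the observation alone lowers the reduction, while a simultaneous increase of the background error variance raises it again, which is the mitigation effect described above.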

6. Discussion 

Observation errors have a notable impact on the amount of ‘work’ done by the data 
assimilation system. Unsurprisingly, the observation innovations and (to a lesser degree) 
analysis increments show significantly increased variance when observation error variances 
are increased. Observation innovation d is changed both directly by the observation error 
ingested in the current cycle and indirectly by alterations to the forecast skill from the 
previous cycle, as 

$$ \langle \mathbf{d}\mathbf{d}^T \rangle \approx \tilde{\mathbf{R}} + \mathbf{H}\tilde{\mathbf{B}}\mathbf{H}^T \tag{10} $$

where $\tilde{\mathbf{R}}$ and $\tilde{\mathbf{B}}$ are, respectively, the actual observation and background error covariances 
that may differ from the corresponding matrices used by the DAS. One notable result of 
these experiments is that changes to the forecast $\mathbf{x}_b$ are relatively small when $\tilde{\mathbf{R}}$ is altered 
by a large fraction. In the OSSE, the observation errors are not temporally correlated, 
so the forecast error that evolves from the previous assimilation cycle is not correlated 
with the observation error of the following cycle. In reality, some observation errors may 
be temporally correlated [Daley, 1992], although this is not accounted for by the GSI. 
The data assimilation process tends to damp out observation errors, particularly spatially 
uncorrelated errors. 
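
The relation between the innovation covariance and the actual error covariances can be checked with a small Monte Carlo experiment. The sketch below assumes a linear observation operator and Gaussian, mutually uncorrelated background and observation errors. The matrix sizes and values are arbitrary choices for illustration, not quantities from the OSSE.

```python
import numpy as np

rng = np.random.default_rng(0)
n_state, n_obs, n_samples = 4, 3, 200_000

# Arbitrary linear observation operator and "actual" error covariances.
H = rng.standard_normal((n_obs, n_state))
factor = rng.standard_normal((n_state, n_state))
B_true = factor @ factor.T + n_state * np.eye(n_state)  # symmetric positive definite
R_true = np.diag([0.5, 1.0, 2.0])                       # uncorrelated observation errors

# Draw independent background and observation error samples.
e_b = rng.multivariate_normal(np.zeros(n_state), B_true, size=n_samples)
e_o = rng.multivariate_normal(np.zeros(n_obs), R_true, size=n_samples)

# For linear H the innovation is d = y_o - H x_b = e_o - H e_b.
d = e_o - e_b @ H.T
sample_cov = d.T @ d / n_samples
expected_cov = R_true + H @ B_true @ H.T

rel_err = np.linalg.norm(sample_cov - expected_cov) / np.linalg.norm(expected_cov)
```

Here `rel_err` shrinks as the sample size grows, and the same estimate degrades if the two error sources are made correlated.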

The analysis increment statistics show significant influence from observation error. How- 
ever, the impact of observation error on the analysis increment is considerably smaller than 
the impact on observation innovation, due to the very effective filtering of spatially uncorrelated 
observation errors by the GSI algorithm. The effect of observation errors on 
the analysis error is smaller than the effect on analysis increment since the increment is 
designed to only reduce the error in a statistical sense; i.e., not everywhere at every time. 
If only a single data type is available in a region, the portion of the observation error 
that is correlated will have the greatest impact on the analysis quality. If multiple data 
types are available and the observation error is not correlated between data types, as in 
the OSSE, then the impact of spatially correlated error will also be reduced. As the data 
network becomes more sparse, the role of uncorrelated error increases, as there is less 
opportunity for uncorrelated errors from nearby observations to compete. 

In a statistically stable assimilation system, an equilibrium must be obtained that bal- 
ances the competing effects of model error, assimilated observation error, error growth or 
damping between cycle times, and the ingestion of useful information from observations. 
Usually this implies that the improvement to the analysis by ingesting observations is 
balanced by the subsequent error growth during the forecast that creates the next background 
[Daley and Menard, 1993]. In the Double case, this equilibrium is apparently 
more complex since the analysis increments for some fields in some regions of the globe 
actually increase the analysis error with respect to the background error on average. 
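
A scalar caricature of this balance is sketched below. It assumes a fixed gain (as when the covariances specified in the DAS are held constant), a constant factor by which error variance grows during each forecast, and an additive model error variance. All names and parameter values are hypothetical and chosen only to show the qualitative behavior.

```python
def cycle_to_equilibrium(r_true, q, gain=0.5, growth=1.5, n_cycles=200):
    """Iterate a scalar analysis/forecast cycle until the variances settle.

    r_true : actual observation error variance
    q      : model error variance added during each forecast
    gain   : fixed scalar gain used by the assimilation (assumed not retuned)
    growth : factor by which analysis error variance grows during the forecast
    """
    b = 1.0  # arbitrary initial background error variance
    for _ in range(n_cycles):
        a = (1.0 - gain) ** 2 * b + gain ** 2 * r_true  # analysis step
        b = growth * a + q                               # forecast step
    a = (1.0 - gain) ** 2 * b + gain ** 2 * r_true
    return b, a


# Doubling the observation error standard deviation quadruples its variance.
b_small_q, _ = cycle_to_equilibrium(r_true=1.0, q=0.1)
b_small_q_double, _ = cycle_to_equilibrium(r_true=4.0, q=0.1)
b_large_q, _ = cycle_to_equilibrium(r_true=1.0, q=2.0)
b_large_q_double, _ = cycle_to_equilibrium(r_true=4.0, q=2.0)

ratio_small_q = b_small_q_double / b_small_q
ratio_large_q = b_large_q_double / b_large_q
```

In this sketch the equilibrium background error variance rises when the observation error is doubled, but proportionally less when the model error term dominates.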

The wind and temperature analysis fields show different responses to observation er- 
ror, with a considerably stronger response to increased observation errors in the wind 
analysis field. While the conventional data types have fairly similar temporal and spa- 
tial distributions of temperature and wind observations (with the exception of satellite 
winds), the distributions of satellite radiances differ significantly from that of satellite 
winds. Satellite winds are associated with clouds or water vapor features but infrared radiance 
observations for channels that peak low in the atmosphere are absent from cloudy 
regions. Data impacts can be greater in the Southern Hemisphere both because it is winter 
during the experimental period, implying greater variances and synoptic-scale baroclinic- 
ity and therefore greater error variances, and because there are fewer strongly-weighted 
conventional observations. 

As the model integrates forward in time, only a small portion of the initial errors 
experience growth. Some errors, particularly those with small spatial scales, may be 
effectively filtered out by the model. Most errors will project onto modes that are damped 
or that experience only very slow growth, but a fraction of errors will project onto modes 
that grow rapidly [Ehrendorfer and Errico, 1995]. Regional variation is seen for the impact 
of observation errors on the forecast skill, reflecting the differences in both the dynamics 
of error growth and the nature of the observational network around the globe. In the 
tropics, the initial error growth rate is very high due to convective processes [Hodyss and 
Majumdar, 2007] but these errors saturate quickly on a local scale. Thus, the forecast 
skill in the tropics is almost completely insensitive to observation errors, as these errors 
are rapidly overwhelmed by those in the model physics. 

In the midlatitudes, error growth is modest and localized during the first day of the 
forecast, but the rate of error growth then increases during the second and third day as 
the errors spread into the mesoscale and synoptic scales. Errors in the midlatitudes do 
not saturate within the five day forecast [Hodyss and Majumdar, 2007]. The significant 
differences seen in the extratropical analysis error in the three test cases are muted in the 
120 hour forecast error fields. 

There are several factors that influence the observation impact when observation errors 
are increased. The magnitude of the observation impact indicates the amount of ‘work’ 
done by the observations when adjusting the background field. If the background field 
had no error, there would be no possible improvement, and the observation impact would 
be zero or detrimental to the model state. In a properly functioning data assimilation 
system, the net (average) influence of observations should be to improve the quality of 
the analysis compared to the background field, although many of the observations may 
have a neutral or detrimental effect on the analysis state [Gelaro et al., 2010]. 

When the analysis error is increased due to ingestion of greater observation errors, these 
additional errors grow during forward integration and increase error in the background 
field of the following cycle time. The total observation impact may then be increased 
as there is more ‘work’ to be done to correct the background field, even though the 
observations themselves are degraded by larger observation error variance. The increase 
in observation impact seen in the Southern Hemisphere extratropics as the observation 
error is increased is an example of this effect. Although the analysis error is also increased 
in the Northern Hemisphere and tropical regions, the total observation impact is not 
significantly affected in these regions. It is speculated that this may be due to the more 
nonlinear growth of errors where convective processes play a strong role in the tropics and 
summer hemisphere. 

Although the OSSE framework allows for direct manipulation of the observation errors, 
there are some limitations of the system. One caveat of the Perfect observation case is that 
the observations are not completely free of error. While the observations in the Perfect 
case are drawn directly from the ‘truth’, there are intrinsic errors of representativeness due 
to the difference in model resolution, and due to temporal interpolation that introduces 
errors. It is expected that these errors are much smaller than observation errors that 
occur in the real world because the spatial resolutions of the Nature Run and assimilation 
grids are not so very different. 

When the observation and background error covariances specified in the GSI are not 
the true covariances, the DAS results are sub-optimal. The specified covariances are 
only approximations to the true ones whether the GSI is applied to real observations or 
the OSSE context (e.g., the true observation error covariance is definitely not diagonal as 
assumed by GSI). Although the degrees of approximation may differ, for the OSSE Control 
case, the added observation errors were tuned in an attempt to make various performance 
statistics similar to those for the Real case, and thus the degrees of sub-optimality of those 
two cases may be similar. For the other experimental cases, including the Double GSI 
Adjusted case, this is likely not true. In any case, however, the skill metrics obtained 
should be considered simply as upper-bounds on what their values would be were GSI 
tuning truly optimal. 

A caveat of these experiments is that the added observation errors may not have com- 
pletely realistic characteristics. Although the synthetic observation errors have been ex- 
tensively calibrated, it is possible that some errors have been adjusted in ways that are not 
realistic in order to compensate for other deficiencies of the OSSE. For example, synthetic 
bias has not been added to the observations because the portion of bias that is assumed 
by the DAS is removed by its bias correction algorithm. However, bias that is less well 
known likely exists in reality, but this bias is difficult to simulate precisely because it is 
not well observed or understood. 

One motivation of this study was to determine if it is possible to manipulate the observation 
errors in order to ‘calibrate’ the forecast skill statistics of the OSSE system. The 
results show that unrealistically large increases in the observation error would be neces- 
sary in order to appreciably change the forecast skill of the OSSE. In fact, one implication 
is that if the only metrics of interest for a particular OSSE are the forecast skill and ob- 
servation impacts, the synthetic errors may be eliminated entirely with little effect on the 
experimental results. However, if the analysis quality, observation innovation, or analysis 
increments are of concern, the observation errors must be carefully calibrated. This result 
may depend on the amount of model error in the OSSE system, and it is possible that 
observation error may play a stronger role in the forecast skill of a fraternal or identical 
twin experiment, where model error is minimal. 

This work also quantifies the effects of significant mismatches between the actual ob- 
servation error covariances and the error covariances assumed by the data assimilation 
system. Decreasing the actual observation error covariances while holding the DAS ob- 
servation error covariances constant results in modest reductions in the total error of the 
analysis state, but the effects on the forecast skill are minimal. 

Appendix A: Theoretical relationships among errors 

Some simple relations between the analysis error and the errors of the background state 
and the ingested observations can be found both for the ‘ideal’ case in which the error 
covariances employed by the DAS are accurate, and for the more realistic case in which 
there is a mismatch between the true error covariances and the covariances assumed by 
the DAS. 



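The analysis state $\mathbf{x}_a$ can be expressed as 

$$ \mathbf{x}_a = \mathbf{x}_b + \mathbf{K}\left[\mathbf{y}_o - \mathcal{H}(\mathbf{x}_b)\right] \tag{A1} $$

where the background state $\mathbf{x}_b$ is adjusted by the ingestion of observations $\mathbf{y}_o$ using the 
observation operator $\mathcal{H}$ and the Kalman gain $\mathbf{K}$. The gain is expressed as 

$$ \mathbf{K} = \left(\mathbf{B}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{R}^{-1} = \mathbf{B}\mathbf{H}^T\left(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\right)^{-1} \tag{A2} $$

where $\mathbf{B}$ and $\mathbf{R}$ are the specified, but not necessarily true, background error and observation 
error covariance matrices and $\mathbf{H}$ is a linearized form of $\mathcal{H}$. 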
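
The equality of the two forms of the gain in (A2) can be confirmed numerically. The sketch below uses small random matrices chosen only for illustration; the dimensions and values are assumptions and have no connection to the GSI configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_state, n_obs = 5, 3

H = rng.standard_normal((n_obs, n_state))       # linearized observation operator
factor = rng.standard_normal((n_state, n_state))
B = factor @ factor.T + np.eye(n_state)         # symmetric positive definite
R = np.diag(rng.uniform(0.5, 2.0, n_obs))       # diagonal observation error covariance

B_inv = np.linalg.inv(B)
R_inv = np.linalg.inv(R)

# First form of (A2): inverse taken in state space.
K_state_space = np.linalg.inv(B_inv + H.T @ R_inv @ H) @ H.T @ R_inv
# Second form of (A2): inverse taken in observation space.
K_obs_space = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)

gain_gap = float(np.max(np.abs(K_state_space - K_obs_space)))
```

The observation-space form is usually the cheaper one when there are fewer observations than state variables, since the matrix to be inverted is smaller.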

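Define errors of the analysis state, $\mathbf{e}_a$, the background state, $\mathbf{e}_b$, and the observation 
errors, $\mathbf{e}_o$, in relation to the true state $\mathbf{x}_t$ as defined in the analysis subspace. 

$$ \mathbf{e}_a = \mathbf{x}_a - \mathbf{x}_t \tag{A3} $$

$$ \mathbf{e}_b = \mathbf{x}_b - \mathbf{x}_t \tag{A4} $$

$$ \mathbf{e}_o = \mathbf{y}_o - \mathcal{H}(\mathbf{x}_t) \tag{A5} $$

Note that $\mathbf{e}_o$ includes both instrument and representativeness errors and has a different 
length (is defined in a different mathematical space) than the vectors $\mathbf{e}_a$ or $\mathbf{e}_b$. 
Subtracting $\mathbf{x}_t$ from both sides of (A1) and linearizing $\mathcal{H}$ about the true state gives 

$$ \mathbf{e}_a = \mathbf{e}_b + \mathbf{K}\left[\mathbf{e}_o - \mathbf{H}\mathbf{e}_b\right] \tag{A6} $$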

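Assuming that observation and background errors are uncorrelated, and noting that $\mathbf{K}\mathbf{H}$ 
is symmetric, covariances of the analysis error can be constructed as 

$$ \langle \mathbf{e}_a\mathbf{e}_a^T \rangle = (\mathbf{I} - \mathbf{K}\mathbf{H})\langle \mathbf{e}_b\mathbf{e}_b^T \rangle(\mathbf{I} - \mathbf{K}\mathbf{H}) + \mathbf{K}\langle \mathbf{e}_o\mathbf{e}_o^T \rangle\mathbf{K}^T \tag{A7} $$

where the angle brackets indicate a sample mean or expectation based on that sample. 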
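If the $\mathbf{B}$ and $\mathbf{R}$ assumed by the DAS are the true ones, then the $\mathbf{K}$ employed is the 
optimal one, yielding the optimal analysis error covariance $\mathbf{A}$. The true covariances 
corresponding to these prescribed ones will be denoted by a tilde: 

$$ \langle \mathbf{e}_a\mathbf{e}_a^T \rangle = \tilde{\mathbf{A}} \tag{A8} $$

$$ \langle \mathbf{e}_b\mathbf{e}_b^T \rangle = \tilde{\mathbf{B}} \tag{A9} $$

$$ \langle \mathbf{e}_o\mathbf{e}_o^T \rangle = \tilde{\mathbf{R}} \tag{A10} $$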

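These are related by 

$$ \mathbf{A} = \left(\mathbf{B}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H}\right)^{-1} \tag{A11} $$

$$ \tilde{\mathbf{A}} = \mathbf{A}\left[\mathbf{B}^{-1}\tilde{\mathbf{B}}\mathbf{B}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\tilde{\mathbf{R}}\mathbf{R}^{-1}\mathbf{H}\right]\mathbf{A} \tag{A12} $$

It can be seen that if $\tilde{\mathbf{B}} = \mathbf{B}$ and $\tilde{\mathbf{R}} = \mathbf{R}$, then $\tilde{\mathbf{A}} = \mathbf{A}$. 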
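
The relation between the specified and true analysis error covariances can be checked numerically. The sketch below evaluates both expressions for small random matrices and compares the result with the covariance obtained by propagating the true background and observation error covariances through the gain. That comparison is written in its general form with an explicit transpose on the second $(\mathbf{I} - \mathbf{K}\mathbf{H})$ factor. The function names, dimensions, and matrix values are assumptions for illustration only.

```python
import numpy as np


def analysis_covariances(B, R, B_true, R_true, H):
    """Return A from (A11) and the true analysis error covariance from (A12)."""
    B_inv = np.linalg.inv(B)
    R_inv = np.linalg.inv(R)
    A = np.linalg.inv(B_inv + H.T @ R_inv @ H)
    inner = B_inv @ B_true @ B_inv + H.T @ R_inv @ R_true @ R_inv @ H
    return A, A @ inner @ A


def propagated_covariance(B, R, B_true, R_true, H):
    """True analysis error covariance from the gain, with explicit transposes."""
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    I_KH = np.eye(B.shape[0]) - K @ H
    return I_KH @ B_true @ I_KH.T + K @ R_true @ K.T


rng = np.random.default_rng(2)
H = rng.standard_normal((3, 5))
factor = rng.standard_normal((5, 5))
B = factor @ factor.T + np.eye(5)
R = np.diag([0.5, 1.0, 2.0])

# Ideal case: the specified covariances are the true ones.
A_ideal, A_true_ideal = analysis_covariances(B, R, B, R, H)
# Mismatch: true observation error standard deviation doubled, DAS values unchanged.
A_spec, A_true_double = analysis_covariances(B, R, B, 4.0 * R, H)
```

With the observation error variance quadrupled and the specified covariances left unchanged, the trace of the true analysis error covariance exceeds that of the specified one.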

First, consider the ideal case where $\tilde{\mathbf{B}} = \mathbf{B}$ and $\tilde{\mathbf{R}} = \mathbf{R}$, for which the data assimilation 
system performance is expected to be optimal [Daley, 1991]. In a cycling data assimilation 
procedure such as GSI, $\mathbf{B}$ is actually an implicit function of $\mathbf{R}$ since it depends on the 
quality of the previous analysis. Thus, increasing $\mathbf{R}$ is expected to increase $\mathbf{B}$ and thereby 
further increase $\mathbf{A}$ to some degree. If $\mathbf{B}$ also reflects a sizeable contribution by forecast 
model error, as it generally does in practice, then the additional influence on $\mathbf{A}$ through 
$\mathbf{B}$ changes will be diminished. 

Figure 1. Variance of observation innovation for July 2005. a) rawinsonde temperature 
observations; b) rawinsonde zonal wind observations; c) 

Figure 2. Square root of the zonal mean of temporal variance of analysis minus background 
T, K for July 2005. a) Perfect, b) Control, c) Double, d) Real. 

Description of all OSSE cases included in this manuscript. Data types are synthetic (OSSE) 
or real (archived observations). “Added Obs Err σ” refers to the standard deviation of synthetic 
observation error explicitly applied to either real or synthetic observations, with “standard” the 
calibrated observation error standard deviations calculated as in Errico et al. [2013]. “GSI Obs 
Err σ” refers to the standard deviations of observation errors used by the GSI data assimilation 
system, with “operational” the values used in the operational version of the GSI from 2011. 

Figure 3. Square root of the zonal mean of temporal variance of analysis minus background 
zonal wind, m s$^{-1}$ for July 2005. a) Perfect, b) Control, c) Double, d) Real, 

July 2005 monthly mean and standard deviation (σ) 500 hPa geopotential anomaly correlation 
coefficients at the 120 hour forecast. Wilcoxon paired rank test p indicating significance 
level that the mean anomaly correlation is different from the Control case mean (for Perfect and 
Double) or different from the Real case mean (for Real Plus Error). Autocorrelation r of the 

July 2005 monthly mean total observation impact for all data types, calculated for the dry 

Figure 10. Adjoint calculations of observation impact on dry energy norm. White bars, 
Perfect case; grey bars, Control case; black bars, Double case; lines indicate 95% confidence 
intervals. Note reversed direction of x-axis. Left, Northern Hemisphere extratropics; center, 
Southern Hemisphere extratropics; right, tropics. 